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Abstract 

Matching cells over time has long been the most dif- 
ficult step in cell tracking. In this paper, we approach 
this problem by recasting it as a classification problem. 
We construct a feature set for each cell, and compute a 
feature difference vector between a cell in the current 
frame and a cell in a previous frame. Then we deter- 
mine whether the two cells represent the same cell over 
time by training decision trees as our binary classifiers. 
With the output of decision trees, we are able to formu- 
late an assignment problem for our cell association task 
and solve it using a modified version of the Hungarian 
algorithm. 



1 Introduction 

Researchers have been creating artificial magneto- 
tactic Tetrahymena pyriformis (T. pyriformis) cells by 
the internalization of iron oxide nano particles, and 
controlling them with a time-varying external magnetic 
field OO. To perform the multi-cell control task, it is 
necessary to track the real-time position of each cell and 
use it as the feedback for the control system. However, 
this problem is very difficult because: the T. pyriformis 
cells are moving fast in the field of view; several cells 
may overlap and occlude each other; the appearances 
of different cells can be similar; and the same cell may 
change in appearance over time (4j. 

Decision tree was proposed as a powerful machine 
learning technique, which can be used for both classifi- 
cation and regression, and has been successfully applied 
on practical systems such as the Kinect gaming platform 
l6l . Since the decision tree can take very complicated 
features as its input and at the same time, once a deci- 
sion tree is trained, the decision procedure can be very 
fast, it is ideal for real-time classification. We use de- 
cision trees as a classifier to determine whether regions 
segmented in neighboring and non-neighboring frames 
represent the same T. pyriformis cell. 



2 Background Subtraction 

To extract the foreground regions of cells, we first 
perform a background subtraction at each frame. There 
are a number of well-known background subtraction al- 
gorithms such as adaptive Gaussian mixture models. 
Since the background in our video is simple and con- 
sistent over time, we simply apply a median filter on 
the time dimension for the first 20 frames to obtain 
the background image. At each frame, we subtract the 
background image from the current frame and take the 
absolute value to obtain a difference image. Then we 
apply thresholding, fill the morphological holes, and ex- 
tract the connected regions as candidate T. pyriformis 
cells. 

3 Feature Extraction 

After the region of each cell is segmented, we extract 
features for each cell according to their shapes, gray- 
level intensities, and the combination of both. Exam- 
ples of the T. pyriformis cell appearance in our video are 
shown in Figure [T] (pixel spacing = 2.32/2.32 um). As- 
sume a cell C in one frame consists of N pixels I(x^), 
where i = 1, 2, . . . , N, and x^ = (x^, yi) are the coor- 
dinates of the zth pixel of C. We can define some useful 
features for this cell, as follows. 

Spatial Features The cell's area N and centroid x c = 

(x c , y c ), where x c = xl and y c = yl, are the most basic 
features. Another useful spatial feature is the nth order 
normalized inertia |1 ], which measures the circularity 
of the shape, and is defined as 
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Histogram Features Using only pixel intensities 
without spatial positions, we can compute several his- 
togram features. First we have the mean and the stan- 




Figure 1: Examples of the T. pyriformis cell appearance. 



4 Decision Trees 

To use decision trees for the cell association prob- 
lem, we first need to map the cell association problem 
to a classification problem. For each cell L in a pre- 
vious frame and each cell C in the current frame, we 
need to know whether they are the same cell. Thus the 
classification problem is to develop a binary classifier / 
such that f(C,L) = 1 if C and L are the same cell, and 
f(C,L) = if they are not. 
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Then we have the standard moments such as the skew- 
ness 7(C) and the kurtosis (3(C): 
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where E[] denotes expectation. We also use the nth 
order root of the nth order central moment M n (C): 



M n (C) = ^E[(I(x % )-^C)) n ] 



(6) 



Composite Features Finally we can compute some 
features that use both spatial information and pixel in- 
tensities at the same time to capture the spatial distribu- 
tion of pixel intensities. We first define the polynomial 
distribution feature P n (C)\ 

1 N 
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Then we define the Gaussian distribution feature 

G n {C): 
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These composite features can be viewed as weighted 
means of pixel intensities. For example, P n (C) gives 
larger weights to pixels far away from the center while 
G n (C) gives larger weights to pixels near the center. 



4.1 Feature Difference Vector 

To simplify the input to the classifier, we define 
a 23 -dimensional feature difference vector v(C, L) = 
(^l? v 2, • • • ? ^23) for C and L. Each entry of this feature 
difference vector reflects the difference of the two cells 
in some aspect. v\ is the distance between the center of 
C and the predicted center of L in the current frame. A 
simple constant velocity assumption is used for the pre- 
diction. For example, suppose the current frame is the 
kth frame and the center of C is x c (C), the center of L 
at the (k — l)th frame was x c (L) and at the (k — 2)th 
frame was x' c (L). Then the predicted center of L at the 
current frame is x p (L) = 2x c (L) — x' c (L), and v\ is 
defined as 

^ = ||x c (C)-x p (L)||. (9) 

V2 is defined as the distance between the centers of the 
two cells: 

^2 = ||x c (C)-x c (L)||. (10) 

Each of the other 21 entries of v(C, L) is defined as the 
absolute value of the difference of some feature of C 
and L. For example, let N(C) and N(L) be the areas 
of C and L, respectively. Then we have 



^3 = \N(C)-N(L)l 
v 4 = KC)-/i(L)|. 
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Similarly, v$ to vj are defined using Equations J5]) to 
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([5]) respectively; v$ to vu are defined using 
n = 3,4,5,6; V12 to vi$ are defined using 
n = 1,2,3,0.5; viq to ^19 are defined using 
n = 1,2,3,0.5; and V20 to V23 are defined using ([8]) 
for n = 2, 4, 6, 8. With these definitions, our classifier 
/ can be viewed as a mapping from the 23 -dimensional 
space to the {0, 1} set. Thus we denote it as /(v(C, L)). 

4.2 Training a Decision Tree 

As shown in Figure [2] a decision tree T consists of 
leaf nodes and split (non-leaf) nodes. Each split node 
consists of a threshold r and an indicator k for a fea- 
ture Vk, where k = 1, 2, . . . , 23. To classify the feature 
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Figure 2: Decision tree example. A decision tree T con- 
sists of split nodes (blue) and leaf nodes (green). The 
red arrows show the branching path for an input v. 

difference vector v, one starts from the root and repeat- 
edly compares its fcth entry with the threshold r to 
decide whether to branch to the left sub-tree (if vu < r) 
or right sub-tree (if Vk > r). The leaf node is a quantity 
Pt(v) indicating the probability of / (v(C, L)) = 1. 

Since it is easy for a human to distinguish whether 
two regions in different frames represent the same cell, 
we can manually assign labels {y} to cell pairs from 
neighboring frames in training videos and associate 
these labels with extracted feature difference vectors 
{v}. In this way we build our training data set D — 
{(v,2/)}, where?/ = /(v). 

The training process follows a maximal entropy re- 
duction procedure | 6 ]: 

1. Beginning from the root, consider a large set of 
splitting candidates (fc,r) which covers all pos- 
sible fc and provides sufficiently delicate subdivi- 
sions for each v^. 

2. For each candidate (fc, r), we partition the training 
set D = {(v, y)} into two subsets: 

A(fc,r) = {(v,J/)hfe<r}, (13) 
D r (k,r) = D\D t (k,T). (14) 

3. Find the candidate (fc,r) that maximizes the en- 
tropy reduction G(k,r): 

(fc*, r*) = argmax G(fc, r), (15) 

(fe,r) 



G(fc,r) = g(J)- J ^ TJI g(A(fc,r)) 

-^(^ftr)), (16) 

where iJ(-) denotes the Shannon entropy and | • | 
denotes the cardinality of a set. 



4. Use (fc*, r*) as the feature indicator and threshold 
at the current split node, and repeat the above steps 
for the left sub-tree with Di(k*,r*) and right sub- 
tree withL> r (fc*,r*). 

5. If the depth reaches the maximum^or the size (or 
entropy) of the partitioned subset D at the current 
node is sufficiently small, then we set this node as 
a leaf node, and the probability at this node is 

= \{(v,y)eD\y=l}\ 
\D\ 

In our work, a tree T\ is trained using D = { (v, y)}, and 
another tree T 2 is trained using D' = {(v', y)}, where 
v' = (^3? ^4? • • • ? ^23) is the truncated feature difference 
vector. Since T2 is trained without center position infor- 
mation, we use Ti for cell association in neighboring 
frames and T2 for cell association in non-neighboring 
frames. 

5 Association Rules 

To track cells over time, we maintain a list {Lj} of 
cells that have appeared in previous frames. The cell 
association task is to determine which cell Lj should be 
associated with the cell Ci in the current frame. Each 
cell in the list has a status, being active, occluded, out, 
or new. 

5.1 Association Matrix 

At frame fc, if we have N\ cells {Ci} in the cur- 
rent frame and N2 cells {Lj} in the list, we define an 
Ni x (7V 2 + 2) association matrix A. Each entry of A 
represents the confidence of associating d with a cell 
in the list or considering it as a new cell. 

If 1 < J < iV 2 and the status of Lj is active, new or 
occluded, we define 

A iJ =P T MCi,L j )). (18) 

Let do denote the maximum possible speed of a T. 
pyriformis cell in pixels per frame, and db(Ci) denote 
the distance from d to the image border. A new cell or 
a re-appearing cell must (re-)enter the image from the 
image border. Thus for 1 < j < N2, if the status of Lj 
is out and db(d) > do, we set Aij = —1; if the status 
of Lj is out and db(d) < do, we define 

A iJ = a 1 P T2 (v f (d,L j )), (19) 

where < a\ < lisa constant denoting the diffi- 
dence (as shown in Table [T]) of associating cells in non- 
neighboring frames. 



^4i,7V 2 +i is the confidence of considering d as a 
new cell that has not appeared in previous frames. If 

d b (Ci) > d , we set A ijN2+1 = -1; if d b (Ci) < d , 
we define 



A i,N 2 +i = a 2 exp(-X 1 dl(C i )) : 



(20) 



where < a 2 < 1 and Ai > are two constants. 

To also include the case when one cell is occluded 
by another, we use A^jv 2 +2 to represent the confidence 
that Ci is a region of more than one cell overlapping. 
Let {Lj} be a subset of the list containing only cells 
that have appeared in the (k — l)th frame. We define 
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where < as < 1 and A2 > are two constants, 
d c (Ci,Cj) is the center distance between d and Cj, 
and d p (Ci, L'-) is the distance from the center of d to 
the predicted center of Lj, which is similar to (|9]). The 
right-hand side of §2l\ can be thought of as tne pre- 
dicted number of cells minus the observed number of 
cells near d • 

5.2 Association List 

Once the association matrix A is created, we need 
to associate each d (or each row of A) with a column 
of A. This problem is similar to the standard assign- 
ment problem [3|, but note that: first, A is generally 
not a square matrix; second, more than one d can be 
considered as new or being a region of occluded cells. 
We define an association list £ = (Ci ? C2 ? • • • ? CiVi ) suc h 
that d — j if Ci is associated with the jth column of A. 
Then our problem is to find 



c* 



argmax 
£ 7=1 



(22) 



such that for any 1 < j < N 2 , there is at most one i that 
satisfies Q = j. 

5.3 The Modified Hungarian Algorithm 

When Ni and are both small, the derivative as- 



signment problem (22) can be solved using a brute- 
force search. But when Ni and N 2 are large, the com- 
putational complexity of brute-force search is at least 



O 



(m^{N u N 2 })\ 
\N 1 -N 2 \l 



(23) 



which is unacceptable for real-time tracking. We pro- 
pose a modified version of the Hungarian algorithm Q 



to obtain a suboptimal solution of ( [22] ). The computa- 
tional complexity of the original Hungarian algorithm is 
0((max{7Vi, 7V2}) 3 ), and our modified Hungarian al- 
gorithm can achieve a computational complexity of at 
most 0((max{iVi, 7V 2 }) 3 ■ min{N u N 2 }). The modi- 
fied Hungarian algorithm is given below: 

1 . For any 1 < i < Ni , if the ith row of A satisfies 

argmax A id = j* > N 2 , (24) 

3 

then we associate the ith row with the j*th column, 
and delete the ith row from A to obtain aiVj x 
(N 2 + 2) submatrix A'. 

2. Perform the standard Hungarian algorithm on the 
first N 2 columns of A' to get an association list £. 

3. For 1 < i < N[, compute this value: 



Q(A',i) -max{A- )JV2+1 ,A- )JV2+2 } A[^ 
If maxn(A\i) > 0, let 

i 

i* = argmax Q,(A\i). 



(25) 



(26) 



Then we associate the z*th row of A' with the 
(A^ 2 + l)th column if 7Vs+1 > A' i%N2+2 or the 
(N 2 + 2)th column if otherwise. Now we delete 
the i*th row from A', update 7V-[, and repeat steps 
EltoS 

4. If max Q(A', i) < 0, then the current £ is the final 
association result for the current A! . 

5.4 Updating the List of Cells 

We update the list of cells {Lj} according to the as- 
sociation result of each d : 

• If Ci is associated with a cell Lj in the list (1 < 
Ci ^ N 2 ), we update the status of Lj to active and 
replace all its features with the features of d • 

• If Ci is considered to be a new cell (Q = N 2 + 1), 
we simply add it to the list and set its status to new. 

• If Ci is considered to be a region of overlapping 
cells (Q = N 2 + 2), we search its neighbourhood 
within the radius do for unassociated Lj 's that have 
appeared in the (k — l)th frame. Then we set the 
status of these L/s to occluded, and update only 
their center positions to the center position of d . 

• For any Lj that appeared in the (k — l)th frame 
and is still unassociated after all the above steps: if 
it is within the distance do from the image border, 
we set its status to out; otherwise, we associate it 
with the closest unassociated d. 



(a) (b) (c) 

Figure 3: Resulting cell trajectories of our method on testing videos, (a) Cells getting very close, (b) Two cells 
occluding each other (red vs. blue, and red vs. black), (c) Many cells appearing in the frame at the same time. 



Depth of Tree Error Rate of T\ Error Rate of T2 



5 1.48% 14.02% 

6 1.46% 13.91% 

7 1.69% 12.95% 

8 1.37% 12.49% 

9 1.54% 13.07% 

10 1.55% 13.61% 



Table 1: Performance of decision trees. 

6 Experiments 

To build the training set, we manually label the 
cell regions segmented in 1000 frames from 2 videos, 
where each frame is an 8 -bit gray-level image of size 
640 x 512, and the videos were captured at 8 frames 
per second 0. The typical size of a T. pyriformis cell 
is between 150 and 400 pixels, and the typical speed is 
between 5 and 40 pixels per frame. To evaluate the per- 
formance of the decision trees T\ and T2, we repeat 20 
independent experiments for different depth of the trees, 
and in each experiment we use randomly selected 70% 
of the cell pairs as the training data, and the rest 30% as 
the test data. The number of subdivisions of each fea- 
ture is set to 1000, and the stop- splitting size of D is set 
to 20. The classifier / is configured such that /(v) = 1 
if Pt(v) > 0.5. The resulting average rates of misclas- 
sification are shown in Table [T] and all 23 entries of the 
feature difference vector v turn out to be useful. 

In our tracking experiments, we use decision trees 
of depth 8. The parameters ai, oli, <^3, Ai, A2 and 
do are selected empirically by studying the training 
videos, and the tracking results are not sensitive to mi- 
nor changes of these parameters. Example cell trajecto- 
ries obtained by our method are shown in Figure [3] In 
these examples we set a\ = 0.1, 0L2 = 0.1, as = 0.8, 
Ai = 0.00008, A 2 = 0.00005 and d = 70. 



7 Discussion and Future Work 

This paper has shown a novel cell tracking method 
using decision trees as classifiers for feature difference 
vectors. The misclassification rates of our decision trees 
are very low, and the tracking results of our method 
are robust against cell occlusions. Future work includes 
evaluating this method on more complicated videos in 
which more cells exist in the frame and occlusions of 
many cells can happen. Finally we will integrate our al- 
gorithm into the real-time T. pyriformis control system. 
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